Entity Extractor — emails, URLs, phones, dates (regex, no LLM)
Pricing
from $3.00 / 1,000 results
Extract structured entities from free text: email addresses, URLs, phone numbers (incl. Japanese formats and full-width digits), dates (ISO, slash, Japanese 年月日) and IP addresses. Deterministic regex extraction with per-kind counts — fast, cheap, no LLM.
Developer
Shinobu Otani
Maintained by Community
Actor stats: 0 bookmarked, 2 total users, 1 monthly active user
Last modified: 2 days ago
Entity Extractor
Extract structured entities from free text with deterministic regexes — fast, cheap, no LLM.
What it does
- Emails, URLs (http/https, trailing punctuation stripped), phone numbers (international +XX… and Japanese 0X-XXXX-XXXX formats), dates (YYYY-MM-DD, YYYY/M/D, YYYY年M月D日) and IPv4 addresses.
- Input is NFKC-normalized first, so full-width Japanese digits and symbols (for example, 090-1234-5678 written in full-width characters) extract cleanly.
- Each entity kind can be toggled; values are deduplicated by default (first occurrence kept, order preserved).
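The behaviour listed above (NFKC normalization, regex matching per kind, order-preserving deduplication) can be approximated with a short sketch. The patterns and the `extract` function below are illustrative assumptions; the actor's actual regexes are not shown in this README and are likely stricter.

```python
import re
import unicodedata

# Hypothetical patterns for illustration only; not the actor's real regexes.
PATTERNS = {
    "emails": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "urls": re.compile(r"https?://[^\s<>]+"),
    "phone_numbers": re.compile(
        r"(?<!\d)(?:\+\d{1,3}[ -]?\d[\d -]{6,13}\d|0\d{1,4}-\d{1,4}-\d{4})(?!\d)"
    ),
    "dates": re.compile(
        r"(?<!\d)(?:\d{4}-\d{2}-\d{2}|\d{4}/\d{1,2}/\d{1,2}"
        r"|\d{4}年\d{1,2}月\d{1,2}日)(?!\d)"
    ),
    "ip_addresses": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def extract(text, kinds=None, unique=True):
    """Return {kind: [values]} for one text (hypothetical helper)."""
    # NFKC folds full-width digits and symbols into their ASCII forms.
    normalized = unicodedata.normalize("NFKC", text)
    result = {}
    for kind, pattern in PATTERNS.items():
        if kinds is not None and kind not in kinds:
            continue  # this entity kind is toggled off
        values = pattern.findall(normalized)
        if kind == "urls":
            # Strip punctuation that trails a URL at the end of a clause.
            values = [v.rstrip(".,;:!?)") for v in values]
        if unique:
            # dict.fromkeys keeps the first occurrence and preserves order.
            values = list(dict.fromkeys(values))
        result[kind] = values
    return result


# Full-width "090-1234-5678", written with escapes to avoid lookalike characters.
FULL_WIDTH_PHONE = (
    "\uff10\uff19\uff10\uff0d\uff11\uff12\uff13\uff14\uff0d\uff15\uff16\uff17\uff18"
)
print(extract(FULL_WIDTH_PHONE, kinds={"phone_numbers"}))
```

Normalizing once before matching keeps each pattern ASCII-only, which is the main reason to apply NFKC up front rather than widening every character class.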
Input
{
  "texts": ["Contact info@example.com or 03-1234-5678, see https://example.com on 2026-06-13."],
  "emails": true,
  "urls": true,
  "phone_numbers": true,
  "dates": true,
  "ip_addresses": true,
  "unique": true
}
Output (one dataset item per text)
{
  "emails": ["info@example.com"],
  "urls": ["https://example.com"],
  "phone_numbers": ["03-1234-5678"],
  "dates": ["2026-06-13"],
  "ip_addresses": [],
  "counts": {"emails": 1, "urls": 1, "phone_numbers": 1, "dates": 1, "ip_addresses": 0},
  "total": 4,
  "index": 0
}
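In the sample output, `counts` holds the length of each list and `total` is their sum (1 + 1 + 1 + 1 + 0 = 4). The sketch below rebuilds an item of that shape from already-extracted values. `build_item` is a hypothetical helper, and treating `index` as the position of the text in `texts` is an assumption inferred from "one dataset item per text".

```python
KINDS = ("emails", "urls", "phone_numbers", "dates", "ip_addresses")


def build_item(found, index):
    """Assemble one output item from {kind: [values]} (hypothetical helper)."""
    # Missing kinds become empty lists so every item has the same keys.
    item = {kind: list(found.get(kind, [])) for kind in KINDS}
    # Per-kind counts are the list lengths; total is their sum.
    item["counts"] = {kind: len(item[kind]) for kind in KINDS}
    item["total"] = sum(item["counts"].values())
    # Assumed: index is the position of the source text in the input list.
    item["index"] = index
    return item


sample = build_item(
    {
        "emails": ["info@example.com"],
        "urls": ["https://example.com"],
        "phone_numbers": ["03-1234-5678"],
        "dates": ["2026-06-13"],
    },
    index=0,
)
print(sample["counts"], sample["total"])
```

Keeping the counts alongside the lists lets downstream filters (for instance, "skip texts with no contact details") run without touching the values themselves.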
Usage
Point it at scraped pages, support tickets, or listings to pull out contact details and dates for CRM enrichment, lead lists, or monitoring pipelines.
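For the lead-list use case, dataset items can be flattened into one row per contact detail. This is a minimal sketch that assumes items shaped like the sample output above; `items_to_rows`, `rows_to_csv`, and the column names are hypothetical and not part of the actor.

```python
import csv
import io

CONTACT_KINDS = ("emails", "phone_numbers", "urls")
FIELDS = ["source_index", "kind", "value"]


def items_to_rows(items):
    """Flatten output items into one row per extracted contact value."""
    rows = []
    for item in items:
        for kind in CONTACT_KINDS:
            for value in item.get(kind, []):
                rows.append(
                    {"source_index": item.get("index"), "kind": kind, "value": value}
                )
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text suitable for a CRM import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


items = [
    {
        "emails": ["info@example.com"],
        "urls": ["https://example.com"],
        "phone_numbers": ["03-1234-5678"],
        "index": 0,
    }
]
print(rows_to_csv(items_to_rows(items)))
```

The `source_index` column ties each row back to the text it came from, so a row can be traced to the original page or ticket.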